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A wide variety of numerical methods are evaluated and compared for solving the stochas- 
tic differential equations encountered in molecular dynamics. The methods are based on 
^"^ the application of deterministic impulses, drifts, and Brownian motions in some combina- 

CS| tion. The Baker-Campbell-Hausdorff expansion is used to study sampling accuracy following 

^ recent work by the authors, which allows determination of the stepsize-dependent bias in 

"^ configurational averaging. For harmonic oscillators, configurational averaging is exact for 

"^" certain schemes, which may result in improved performance in the modelling of biomolecules 

where bond stretches play a prominent role. For general systems, an optimal method can 
fC be identified that has very low bias compared to alternatives. In simulations of the alanine 

' dipeptide reported here (both solvated and unsolvated), higher accuracy is obtained with- 

n out loss of computational efficiency, while allowing large timestep, and with no impairment 

^ of the conformational exploration rate (the effective diffusion rate observed in simulation) . 

C/^ The optimal scheme is a uniformly better performing algorithm for molecular sampling, with 

O 
•▼^ overall efficiency improvements of 25% or more in practical timestep size achievable in vac- 

^^ uum, and with reductions in the error of configurational averages of a factor of ten or more 

Qh attainable in solvated simulations at large timestep. 

> 

OS 

^ I. INTRODUCTION 

"^ One of the major challenges in understanding matter at the molecular scale is the problem 

^^ of thermodynamic sampling: the calculation of averages with respect to the canonical (Gibbs- 

^ Boltzmann) distribution. In many cases the aim is to sample configurational quantities only, and 

^ this is the focus of this article. Given the classical molecular potential energy function U : R^^ -^ R, 

H 

^ the configurational canonical density is 



-pp{q)^Z-'e-P^^'^)^ 
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FIG. 1: The computed potential energy distribution is shown for the method of Briinger, Brooks and Karplus 
applied to a single alanine dipeptide protein at 300K in a vacuum using the CHARMM22 forcefield; the 
energy distribution becomes distorted as the step size increases. 

where /3~^ = kBT where fc^ is Boltzmann's constant, T is temperature, and Z is a normalization 
constant so that p^ has unit integral over the entire configuration space. In using molecular dynam- 
ics to sample the phase space according to the canonical distribution, the formulation employed 
may not be ergodic (meaning that it may not sample the entire phase space) and, moreover, the 
design of the time-discretization methods typically distorts the equilibrium distribution. Using a 
stochastic differential equation model such as Langevin dynamics, which introduces random pertur- 
bations into each force component, we can overcome the first of these problems, as the formulation 
is well known to be ergodic. As an illustration of the effect of step size error, see Figure 1, where 
the potential energy in simulations of alanine dipeptide is shown to be corrupted by a popular 
Langevin discretization. 

Given that the majority of computational work in any MD algorithm lies in the force calculation, 
most of the existing methods in common use have been designed to require only one force evaluation 
per timestep. For timestepping methods that accurately sample the canonical distribution, the 
available timescales for simulation are restricted by the problem itself (e.g. the heights of barriers, 
or entropic properties of the landscape). In designing a new molecular dynamics algorithm the goal 
is to enlarge the usable timestep in order to allow finite trajectories to access a larger portion of the 
phase space. The drawback of working in the high-timestep regime is that for long-time simulations 
the computed probability distribution is a perturbation of p^, dependent on the step size, leading 
to a distortion in calculated averages. The simulator must choose the step size sufficiently small 
enough to avoid corruption in averages, but still large enough to ensure a thorough exploration of 
configuration space. 

The potential energy function in molecular dynamics determines the maximum allowable step 



size. For a harmonic oscillator with frequency uj the Verlet method has a stability restriction of 
5t < 2/uj [1]. Most numerical methods (including ones constructed for Langevin dynamics) suffer 
from a similar limitation in the maximum timestep size driven by the presence of stiff oscilla- 
tory solution components. However, well before reaching the stability threshold, averages may be 
severely corrupted, introducing artificial-and, often, severe-step size restriction. By removing or 
reducing this bias, it becomes possible to substantially increase the timestep, with a direct impact 
on the efficiency of simulation. Given the explosion in the use of molecular dynamics in chemistry, 
physics, engineering and biology, it is worth noting that where molecular dynamics is used for 
round-the-clock configurational sampling calculations, a quantifiable improvement in method effi- 
ciency (or timestep size) directly translates to a reduction in machine costs, a reduction in energy 
costs, and, often, a reduction in delays to publication. 

With regard to the error in averages, it is usually assumed that the error due to having insuffi- 
cient samples dominates the timestep-dependent discretization error, but this is not typically the 
case at large step size, as we demonstrate in numerical experiments (see Section V). To dramatize 
the role of discretization error, we show in Figure 2 the example of the configurational distribution 
for a simple two-basin model solved using two different numerical methods. The figure illustrates 
that where step size error is substantial, crucial features of the landscape such as the heights of 
free energy barriers, may be completely altered. Moreover, it is entirely possible for the relative 
heights of different barriers to be altered in such a way that one transition becomes more prevalent 
than another. 

Theoretical analysis of the error in the invariant distribution can be performed for harmonic 
oscillators without great difficulty (see Section III), but it can also be carried out for general 
nonlinear problems. This is most straightforward in the case of splitting methods. In this article 
we draw on principles of geometric integration, building on our understanding of splitting methods 
from the deterministic setting [2] . Splitting methods for Langevin dynamics have been considered 
in the past [3-10] but a wide variety of schemes can be constructed by splitting and until now the 
rational basis for selecting one scheme over another has been absent. Drawing on the work of Talay 
and Tubaro [11], we have studied the generator of the numerical method directly by examining 
expansions for the invariant measure of the Langevin dynamics scheme [12]; this investigation lead 
to the concept of the associated density p of the numerical method: 

piq,p) oc exp (-/? [Hiq,p) + St^f2{q,p) + St^f^q,?) + ...]). (1) 

This expansion allows different methods to be compared on a rational basis (as concerns the effect 
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FIG. 2: The configurational density function Pi3{q) is shown for a planar two-basin potential, computed 
using two different Langevin dynamics methods at various (stable) timesteps increasing from left to right. 
The color indicates the value of the computed probability density: from high (red) to low (blue) over the unit 
square. The methods have the same computational cost (in terms of force evaluations), but give different 
results at large timesteps. Details on this computation are given in Appendix A. 



of discretization error). In the past, for deterministic methods, this type of analysis has also been 
used for the correction of averages [13, 14]. In typical cases which would be relevant for molecular 
simulation, the error introduced in averages using such methods would be second order in the 
timestep (i.e. would go to zero quadratically as the step size is reduced). 

For a particular ordering of the building blocks of the numerical method, a "superconvergence" 
(cancellation) property can be obtained in the high friction limit, meaning that in fully resolved 
molecular dynamics simulations and after integrating out with respect to momenta, the leading 
term in the expansion vanishes [12]. This theoretical convergence result was until now only studied 
for relatively simple model problems. Moreover the crucial question of the step size (stability) 
threshold of the different schemes (as well as the overall efficiency of the various methods) cannot 
be addressed using the asymptotic technique since it provides information only about the small 
step size limit {St -^ 0). 

Another important issue raised by practitioners concerns the fact that such a super convergent 
method, relying on large friction, might not be useful in realistic settings since it is known that 
large friction can reduce the diffusion rate. The problem is complicated by a number of issues: 
both friction coefficient and step size affect the long-term averaging error differently for different 
methods, and the friction coefficient (and, in principle, the step size) may affect the diffusion rates 
differently for different methods. Performance is further dependent on the type of problem under 



study. Thus there is a need for careful study of the methods in the context of systems relevant for 
molecular dynamics (for example, containing both steep potentials such as Lennard- Jones and stiff 
bonds). In this article we address both issues: we consider large step size and modest values of the 
friction coefficient, using numerical experiments to carefully examine the relative performance of a 
large number of different methods. 

In recent years, there has been widespread interest in multiscale methods for enhanced sampling 
[15-18] and such methods likely offer the best approach to bridging the timescale gap. We observe 
that work on enhanced numerical schemes for molecular dynamics remains essential as it plays a 
crucial underpinning role in all the enhanced sampling approaches. Improved trajectory generation 
efficiency (e.g. allowing the use of a larger basic timestep in simulation) thus has a knock-on effect 
on the efficiency of all the methods that rely on such trajectories. While relative improvements 
of a few percentages in efficiency can already warrant a minor change in software implementation, 
our analysis points to a more dramatic (even qualitative) difference among various methods leading 
to prospects for much greater efficiencies by selecting a suitable method. These observations are 
verified in model biomolecular simulations. 

Hybrid Monte-Carlo [19, 20], and other schemes based on Metropolis correction [21], are not 
discussed here, although these could be used in conjunction with several of the methods imple- 
mented. The improvement in thermodynamic sampling obtained through the use of more accurate 
Langevin integrators may, in some cases, provide an alternative to Metropolis-based correction in 
the practical setting. All methods under discussion require one force evaluation per iteration, and 
hence have practically of the same computational cost. 

This article proceeds as follows. In Section II we introduce Langevin dynamics in the context of 
configurational sampling and describe our method for examining the long-time behavior of averages 
under discretization of the stochastic differential equations (SDEs). Section III discusses the har- 
monic model problem, showing that for some particular schemes, the configurational sampling can 
be exact; this has implications for molecular simulations involving stiff harmonic bonds. Section IV 
addresses the errors obtained from computed averages in more general systems. Section V contains 
numerical experiments comparing various methods, both for one degree of freedom systems and 
for solvated and unsolvated alanine dipeptide, through implementation of the schemes in a version 
of NAMD [22]. It is our contention that the numerical experiments of Section V provide strong 
evidence for rejecting many of the schemes in common use for stochastic molecular dynamics and 
favor the optimal BAOAB scheme of [12]. 



II. BACKGROUND 

In this article we focus on Langevin dynamics, 

dq = M-^pdt, (2) 

dp = -VU{q) dt - -ipdt + aM^'^ dW, (3) 

where q^p ^ M^^ are vectors of instantaneous position and momenta respectively, W — W{t) is a 
vector of 3A^ independent Wiener processes, 7 > is a free (scalar) parameter and M is a constant 



diagonal mass matrix. By choosing a — ^27/3 ^ it is possible to show that the unique probability 
distribution sampled by the dynamics is the canonical (Gibbs-Boltzmann) density, defined as 

pp{q,p) = 9,-^e-P^^^^P\ (4) 

for total system energy (Hamiltonian) H{q,p) = p^M~^p/2 + U{q)^ and normalization constant 
ri~^ ensuring the integral is unity. We consider numerical methods designed to integrate (2-3), 
primarily for the purpose of generating trajectories that sample p/3. Such trajectories are often 
used as a means for calculating expectations of functions purely of the position q, and as such the 
dynamical fidelity of computed trajectories is of minor importance compared to the behavior of 
averages in the large-time limit. For such an observable 0, we write the expectation as 

E[cf>{q)]^n-' J J <t>{q)p^{q,p)dqdp^ Z-' J ct>{q)pp{q)dq^ ^irn^T-' J cf>{q{t))dt, 

where the ergodicity of Langevin dynamics ensures a sampling of the desired probability distri- 
bution, and hence the ability to equate the long-time average along a trajectory with the corre- 
sponding spatial average. The challenge comes in integrating (2-3) effectively, and with minimal 
computational cost. 

Given a general potential energy function U(q), we cannot integrate exactly and must evolve 
the dynamics by discretizing in time. Advancing the state requires the use of a numerical method 
which aims to approximate the exact evolution. A distribution of initial conditions p propagated 
using a second-order numerical method will evolve according to the equation 

dp ^* 

where C* may be expressed in the series expansion 

C*^Clj^ + St'^C*2 + Oi5t^), (5) 



where £^0 ^^ ^^^ operator associated to the exact propagation under Langevin dynamics and 6t 
is the step size. The invariant (long-time) distribution sampled by the scheme can in principle be 
obtained by solving the partial differential equation (PDE) >C*p = 0, assuming that the perturbed 
operator £* is known. 

Splitting methods [4-6] offer a simple way of integrating the Langevin dynamics equations; the 
right hand side of (2-3) is divided into pieces, eg. z — f — /i + /2, with each piece is solved exactly 
in sequence. Recent work [12] has shown that numerical methods derived from additive splittings 
of the vector field enable relatively simple computation of a method's characteristic operator. The 
order of integration, and the choice of the splitting will define the method. 

For example, one may break Langevin dynamics into three pieces: 
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which are labelled A, B and O. Each of the three pieces may be solved "exactly": A and B 
correspond to a linear drift and kick when taken individually, while the O piece is associated to an 
Ornstein-Uhlenbeck equation with "exact" solution 



q{t)^q{^), 



pit) = e-TV(O) + -^v/l-e-27tMi/2i?,, 



where Rt ^ A/'(0, 1) is a vector of uncorrelated noise processes. (By "exact" we mean that this 
random map generates the probability distribution p{q,p, t) defined by the solutions of the Ornstein- 
Uhlenbeck equation.) 

Given the pieces of the splitting we code a method by giving the sequence of integration, from 
left to right. The string "ABO" represents the method obtained by solving first the "A" part 
for a timestep 5t, then the "B" part, and finally the "O" part of the system. Where a symbol 
is repeated, as in "BAOAB," there could be ambiguity in this representation but we will assume 
that the method is symmetric so that all "A" and "B" parts in BAOAB are integrated for a half 
timestep. Additionally, the methods of Bussi and Parrinello [23] (OBABO) as well as gla-1 (BAO) 
and gla-2 (BABO) of Bou-Rabee and Owhadi [7], are equivalent to sphtting methods using these 
pieces. 



An alternate splitting formulation [8] 



dq 




M-^p 


dt + 


dp 










-VU{q)dt - 7pdt + aM^/'^dW 



(7) 



A S 

has been used to define two methods: stochastic position verlet (ASA) and stochastic velocity 
verlet (SAS). 

A technique is outlined in Section IV to calculate the operators £* of such splitting methods. 
For methods not derived from splitting the vector field, it can be more difficult to obtain their 
operators and examine their behavior. We compare the configurational sampling for a number of 
popular schemes in this article, not limiting the scope only to splitting methods. 

We will consider both ABOBA and BAOAB methods [12], the Bussi/Parrinello method [23], as 
well as the Stochastic Position Verlet method [8]. The Langevin Impulse (LI) method [9], the BBK 
method [24] and the method of van Gunsteren and Berendsen (VGB)[25] will also be compared, as 
these are frequently found in commercial software packages (for example, NAMD and GROMACS). 
Two first-order methods, Ermak-McCammon (EM) [26] and Ermak-Buckholtz (EB)[27] will also be 
considered in numerical experiments, for completeness, with each method described in Appendix 
B. 

III. PERFORMANCE OF LANGEVIN ALGORITHMS APPLIED TO THE HARMONIC 

OSCILLATOR. 



We begin by considering the harmonic model problem. Harmonic oscillators are useful in 
the molecular simulation setting not only because they allow analytical determination of effective 
distributions, but also because they can be seen to be relevant to understanding the timestep 
limiting features of models for crystalline solids and biomolecules. 

The one-dimensional harmonic oscillator U{q) = Kq'^/2, g G M and K > 0, is a standard 
test case for Langevin dynamics numerical methods, as many issues of stability and timestep in 
molecular dynamics simulations arise due to harmonic potentials used to model covalent bonds. 
For such a simple system we may explicitly write one iteration of a general numerical method 
evolving the dynamics as [28] 



Qn+l 
Pn+1 



* 



Pn 



+ l^n 



where ^ is a matrix depending only on the step size 5t, the friction coefficient 7, the particle mass 
M and the spring constant K^ while jin is a vector of stochastic processes. 

Let ^ = {^ij) and denote the components of jin by /i^j where z, j G {1,2}. Taking products of 
the update equations, we obtain 



Pn+1 = ^21<?n + ^22:Pn + Mn,2 + 2^21^22^71^71 + 2^21/^71,2^71 + "^^^22 l^n,2Pn 
Qn+lPn+l = ^11^21<?n + '012'022Pn + ^71,1^71,2 



(8) 

(9) 



+ (^11^22 + '012^21 )<?7iP7i + (^11/^71,2 + '02lM7i,l)<?7i + ('022/^71,1 + '^12l^n,2)Pn- (10) 

We then take expectations of (8-10) in the limit n ^ oc, giving simultaneous equations 

{q^) = ^2^(g2) + ^2^(p2) + (^2^ + 2^n^i2(gp) + 2^n(Ai^) + 2^12(AlP), (11) 

(P') = ^il(^') + ^i2(p') + (Ai) + 2^21^22(gp) + 2^21 (A2g) + 2^22(A2P), (12) 

(gp) = ^l^ll^l^2l{q^) + ^12^22(2?^) + (A1A2) + ('011'022 + ^I2^2i)(<?p) 

+ ^ll(A2<?) + ^2l(Al<?) + ^22(AlJ>) + '012 (A2P), (13) 

where we use notation (x) = IE[x^] for x G {^,p}, and (/i^) = E[/x^^^]. The value of {fiix) = IE[/X7i,i Xn] 
will ultimately depend on the "memory" of a scheme's stochastic process /i^, and can be found by 
computing Xn+i fJ^n+i,i and taking expectations, yielding an expression involving K[fin^ifJ^n-i,i] • 

We can hence solve the linear system (11-13) to find the error in long-time averages for a method, 
and its behavior under changes in 6t and 7, relative to the spring constant K^ by comparing the 
numerical and analytic averages, the latter given as 



(qp)* 







For the BAOAB and ABOBA methods, the numerical values (in the long-time limit) are 



(^g2«^(BAOAB) 
/^2\ (BAOAB) 
^^^^ (BAOAB) 



Mr' (1 - ^ 




^^2"^ (ABOBA) 
/^2\ (ABOBA) 
^gp^ (ABOBA) 



Mr ^ (1 - ^ 





surprisingly giving exact values for the configurational average. Both schemes yield the same 
friction-independent upper-bound on the step size of Stmax — ^\pMjK: the (determinisitic) Verlet 
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LI 


K-'p-'-Srp+0{5t^) 


ABOBA 


K-^/3-^ 


VGB 


K-'p-' + '-^!l^5t^ + 0{6t^) 


BBK 


^-V-^(l-^)" 


EM 


K-'P-'{i-m)" 


BP 


^-^r^(i-€f)" 


EB 


K-'p-' + ^, + o{5e) 



TABLE I: The expected long-time computed average of q^ using each Langevin dynamics scheme, for the 
ID harmonic oscihator U{q) = Kq^ /2. For brevity, some results are shown as leading order series in 5t. 

step size threshold. This implies that we can choose any timestep below this limit and still achieve 
perfect sampling of (g^), up to sampling error. This behavior is atypical of Langevin dynamics 
algorithms, for example comparing the Bussi/Parrinello and BBK schemes we find 

-1 



(g2)(BP) 
(/)(BP) 
(^^)(BP) 







^^2N^(BBK) 
^^2^ (BBK) 
(^p)(BBK) 







giving identical second-order errors in configurational averages for this system, with the same value 
of 5tmax- The BBK scheme has a first order error in {p^) that is also friction-dependent. 

The computed configurational averages for each scheme are shown in Table I, while Figure 3 
shows the result of computing the value of {q^) numerically, using a fixed total number of steps 
and varying the step size. Three distinct regimes can be seen: the first-order "Ermak" methods, 
second-order methods and the exact methods (where any error comes solely from sampling error, 
rather than discretization error). 

We find that the method of van Gunsteren and Berendsen [25] is in fact 2nd order accurate 
for configurational sampling, not 3rd order as reported by those authors; we attribute this to a 
different notion of accuracy being used in that article. 

IV. ERROR ANALYSIS FOR GENERAL SYSTEMS 



Repeating the analysis in Section III for a more general U{q) in a higher-dimensional setting 
is a challenging task for complicated schemes, but we do have a recently developed framework for 
carrying out the calculations. As we have noted previously we may define the invariant distribution 
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FIG. 3: The numerically-computed average error in {q^)^ using the ID harmonic oscillator with a given 
Langevin Dynamics method. Computation was fixed at 10^ total force evaluations, with M = 7 = /3 = 
K = 1. The results are averaged over 2000 independent repeat runs, with error bars included to give the 
standard deviation in these results. The exactness property of the ABOBA and BAOAB schemes result in 
the error decreasing as step size increases due to sampling error. 



of a second-order method p as the solution of the partial differential equation 

tp = 0. 

Expanding the operator in a perturbation series in terms of the step size 6t and combining equations 
(1) and (5) yields 

±2 /•* 



Equating powers of the step size, we see that the order terms match automatically, leaving the 
leading order perturbation equation to be 



Cl^{ppf2)^r'C*, 



(14) 



By the hypoelliptic property of the exact operator [29], the unique solution to /^ld^ = is oc p^, 
hence the homogenous solution to (14) is simply /2 = c, a constant. Therefore we need only find 
a particular solution /2(^,p) solving (14) in order to find the leading order error in the long-time 
distribution p. Once the perturbation is known, averages may be rebiased accordingly [13], in effect 
increasing the order of the method. Of course /2 may itself be a costly function to evaluate. 
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involving a combination of high order derivatives of U{q), and in the general case this is likely to 
lead to an inefficient method. 

For the ABOBA and BAOAB methods, the right hand side of (14) is 

_^_i^*(ABOBA)^^ = ^ (A,M-^f/(g) - Pp^M-'U"{q)p) + f /M-2f/"(,)V,C/(g) 

_^_l^*(BAOAB)^^ = -^ {A,M-'U{q) - Pp^ M-'U"{q)p) - '-^p^ M-'U"{q)V,U{q) 

where U"{q) :— \/qV'^U{q) is the hessian matrix of mixed partial derivatives. 

Being able to write down equation (14) relies on the calculation of C2, which involves the 
computation of a scheme's perturbed operator (5), characterizing the evolution of a density of 
points in the phase space. This can be a challenge in itself, though for methods that derive from 
splitting the vector field, this operator is easily computed[12] using successive applications of the 
Baker-Campbell-Hausdorff (BCH) formula for products of exponentials [30]. 

As an example, consider the stochastic position verlet (SPV) method, using splitting pieces 

defined in equation (7). The infinitesimal generator associated to each part in (7) is given by 

2 
C\^ = -M-^p • V^0, £^0 = VqU{q) • Vp0 + -fVp (#) + y MAp0. 

Note that the exact operator £^0 ~ ^A + ^s- Using the notation from Section II, we code this 
numerical method as "ASA". The characteristic evolution operator jC*y^^^) is then computed using 

exp (^StC<^^^^^ = exp ({St/2) C\) exp {5tC%) exp {{5t/2) C\) , 

where the BCH formula can be used to simplify the products of exponentials: 

exp(t/:*)exp(t/:*) = exp {t{C\ + Cl) + ^ [C\,Cl] + ^ (\C\, [C\,Cl]] - [C^, [Cl,C\]]) + 0{t''] 

and [C\,Cl] ^ C{C\- C\C\ is the commutator of C\ and C\. 

By splitting the vector field (2-3), and choosing a preferred integration sequence, one can easily 
create and analyse a multitude of Langevin dynamics splitting methods using this technique; though 
it is perhaps surprising how small and subtle changes to the order of each piece's integration can 
yield vastly different average behavior in the long-time limit. This effect is most easily apparent if 
an asymmetry is created when two adjacent letters are swapped in a symmetric method. Symmetry 
ensures that the order of a method is at least two (by the Jacobi identity), while destroying this 
property could hamper stability as well as the order of the method. 
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Once the right hand side of (14) has been computed, solving to find the invariant density is an 
involving task and we do not pursue this here for the methods described above. In the case of the 
ABOBA and BAOAB methods, solutions can be obtained as doubly asymptotic expansions in both 
6t and the reciprocal friction coefficient 7"^; moreover a super convergence property can be demon- 
strated for BAOAB configurational averages implying 4th order accuracy [12]. It is interesting to 
note that in the case of ABOBA, no such cancellation occurs, even though the methods have right 
hand sides that are apparently similar in the leading term. 

The fundamental limitation of the asymptotic approach is that it remains to determine in which 
regime the theoretically obtained features of the perturbed distribution are manifest in simulation. 
Large friction coefficient is known to reduce sampling efficiency, so we would need to work with 
modest values of 7, potentially invalidating the superconvergence property. Likewise the crucial 
issue in many cases is the size of the allowable timestep for simulation, not the asymptotic error 
behavior for small step size. These complexities must be addressed using computer experiment. 

V. NUMERICAL RESULTS 

One of the most important features of a numerical method for ergodic dynamics (such as 
Langevin dynamics) is its preservation of the theoretical global phase space exploration rate. The 
spectral properties of the operator £^0 guarantee that we will explore the entire phase space 
(ergodicity), while the relatively small perturbations to the operator induced by numerical dis- 
cretization are hoped not to significantly alter the rate of search. Ultimately, pushing the timestep 
up is the only way to breach timescale gaps, although this comes at the cost of corruption to the 
long-time averages. 

The self-diffusion coefficient gives a metric quantifying the diffusion rate. It is often used as a way 
to compare the rate of phase space exploration between methods, and typically calculated using the 
integral of the velocity auto-correlation function. However, arbitrary methods can be constructed to 
artificially scale the velocity auto-correlation function, hence giving inaccurate diffusion constants. 

Indeed, calculating the temperature of the system from an average of kinetic energy by 

SNkeT = E [p^M-^p] , (15) 

gives a similar problem. Alternative functions, including functions of q only, can be obtained whose 
averages are proportional to the system temperature [3 1-33]. Such "configurational temperature" 
observables are normally based on the periodic forces of the system in order to work in periodic 
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boundary conditions; in the droplet simulations reported below (i.e. without boundary conditions) 
we used the simpler expression: 

3NkBT = E[q-VU{q)]. (16) 

Were one able to solve the dynamics exactly, the kinetic and configurational temperatures would of 
course be equal. However, using a numerical method in the large-timestep regime we instead sample 
expectations with respect to the perturbed density p, that may introduce discrepancies between 
configurational and kinetic temperatures. It is our view that a configuration-based temperature 
calculation is normally more useful and relevant for assessing the quality of configurational sampling 
methods. 

In a similar way, the speed of exploration of the space should not be determined solely from 
functions of momentum, but should rely on actual barrier crossing rates or times to reach some 
target region of phase space. 

A. One-dimensional double well 

The advantages of performing tests initially on a simple model are that (i) the exact solution is 
known (or can be numerically integrated to arbitrary precision), while (ii) the model's simplicity 
allows us to perform exhaustive computation to refine results and determine asymptotic proper- 
ties. Here we use the algorithms to integrate Langevin dynamics for a one-dimensional model 
with potential function U{q) = (g^ — 1)^ + g', a double-well. This example is well-studied as an 
approximation for modelling a dual state system. We use unit mass, friction and temperature, and 
test a range of step sizes, beginning at St = 0.2 and increasing by 5% until we reach a step size 
where all of the methods are no longer stable. We run 500 independent experiments for each step 
size, with computation fixed at 10^ iterations for each realisation. 

The error in configurational distribution is estimated by dividing the interval [—2,2] into 16 
equal bins, and calculating the observed configurational density in each bin for every computed 
trajectory. The error in the observed densities for each bin are calculated by comparing the absolute 
difference between the observed and exact expected densities (the latter obtained using a high-order 
numerical solver). The overall error in the configurational density is calculated from the root mean 
squared value of these errors. Additionally, we calculate the observed kinetic temperature (15) and 
configurational temperature (16) for each method. The results are shown in Figure 4. 

For the configurational distribution, all the methods shown give a second-order relation in the 
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step size, in contrast to Figure 3. Notably, the BAOAB and ABOBA methods are no longer exact 
for this anharmonic model, while the two first-order methods do not appear at all, as neither of 
them is stable in this region. Of the methods that are stable, the BAOAB method gives both the 
largest usable timestep and the smallest maximum error in the configurational distribution for any 
given timestep. A sample plot of the computed distribution for all schemes at St = 0.25 is also 
given in Figure 4d, it is clear from this that the BAOAB scheme performs exceptionally well. 

Of particular note is the apparent lack of direct correspondence between the errors in configu- 
rational temperature and kinetic temperature. The Bussi/Parrinello scheme is shown to preserve 
the kinetic temperature to a very high degree, with less than a 1% error for 6t < 0.25, while, at the 
same step size, the BAOAB integrator gives more than 10% discrepancy in kinetic temperature. 
However, these results are inverted when looking at configurational sampling accuracy (and config- 
urational temperature). Clearly, if maintaining the configurational averages is the goal, estimating 
the fidelity of the calculation by relying on the kinetic temperature is a risky strategy. A much 
more reliable approach is to make use of the configurational temperature, although even this does 
not give the complete story, since in the example at hand the configurational temperature scales 
with a high power of the step size, while configurational sampling error declines as the second 
power of St. The second order behavior does not contradict our previous observations [12] as we 
are here far from the large 7 limit, but does indicate that the superconvergence property is not the 
key feature at play in the setting of this model problem. 

B. Alanine dipeptide 

In our next experiments, we studied the alanine dipeptide molecule, a classic test case for 
molecular dynamics. We compare computed averages for solvated and unsolvated alanine dipep- 
tide using the BAOAB, ABOBA, van Gunsteren and Berendsen, Bussi and Parrinello, Langevin 
Impulse, Stochastic Position Verlet (SPV) and the Briinger/Brooks/Karplus (BBK) schemes. We 
obtained poor results using the first-order schemes and therefore did not consider them here. To 
provide a means of calculating basline values, we use the stochastic position verlet (SPV) method 
with a small step size, for which the discretization error is essentially negligible. 

We implement each of the methods in the NAMD lite package[22], and observe the effect of 
discretization error (if any) on computed configurational averages. The CHARMM22 forcefield was 
used to compute force interactions. 
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1. Unsolvated 

We simulate the alanine dipeptide molecule (22 atoms) in vacuum at 300K for 2.5ns for multiple 
different step sizes and friction constants, to observe how different simulation parameters affect 
computed averages. Parameters for each run were taken from a 50 x 50 grid, with each point on 



(a) 



10" 



o 

> 



P^ 10" 



(c) 



10" 



o 



10" 




0.2 



0.2 



0.24 
Stepsize 



0.28 



2nd order line 




0.24 
Stepsize 



0.28 




(d) 



0.24 
Stepsize 



-BAOAB 

-ABOBA 

-VGB 

-SPV 

-BP 

-LI 

-BBK 

-EM 

-EB 




BAOAB 

VGB, LI 
ABOBA, SPV 

BBK, BP 



FIG. 4: The BAOAB scheme is shown to significantly reduce configurational discretization errors, which 
distort averages in Langevin Dynamics, for the one-dimensional double- well system with potential energy 
function U{q) = {q'^ — l)'^-\-q. Errors are computed from the average of 500 independent trajectories, with 10^ 
total iterations per trajectory. The step sizes tested began at (5t = 0.2 and were increased by 5% incrementally 
until all schemes became unstable. The relative error in the temperature (computed by averaging the 
momenta) is given in (a), with the scheme of Bussi/Parrinello giving a high order relationship with the step 
size. This is contrasted in (c) where the temperature is calculated using the instantaneous system position; 
here it is instead the BAOAB scheme that gives a high order result. The error in configurational distribution 
(calculated as a root mean squared deviance in the histogram bins) is shown in (b) , with a sample computed 
distribution given in (d) for 6t = 0.25. The exact distribution is shown as a dashed line, with the inset 
magnifying the density of the deepest well. 
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the grid corresponding to a {6t, 7) parameter set for a simulation. The parameters for the bottom- 
left point on this grid are 6ti = Ifs, 71 = 10~^/ps, where each grid point moving upward gives a 
20.7% increase in the friction value used, while each grid point moving right gives a 2.46% increase 
in step size. These ratios were chosen so as to give a broad range of parameter sets to test over, 
while ensuring that the range was not so wide as to yield a large number of unsuitable parameter 
sets (for example, using an unstable step size) leading to wasted computation. All the schemes 
were unstable for the maximum step size tested (^^50 = 3.29fs). 

The results of the simulations for each scheme are given in Figure 5, where we color points on 
the 50 X 50 grid of parameter sets to indicate the results from that respective simulation. Relative 
errors are calculated in the average total potential energy and the average total bond energies, 
where the "exact" comparison value is taken from averaging ten 2.5ns runs using the SPV scheme 
at 6t = 0.25fs, where it is expected that discretization error is not significant. 

With such a small simulation we would perhaps expect to see a very "noisy" result: high 
variances due to the sampling error vastly outweighing the discretization error. But in fact the 
discretization error dominates and is observable at step sizes significantly below the stability thresh- 
old. 

White pixels in the grids in Figure 5 represent a method's instability, showing that in general 
there is a small stability threshold increase for the large-friction case. There is no significant 
increase in this threshold between the methods however, with BAOAB, VGB and LI schemes 
giving a marginal increase over the others. 

One salient feature of the results of Figure 5, is that for the BAOAB scheme there is consistently 
less than a 1% error in the computed configurational temperature (for moderate friction) across 
all step sizes, even the largest stable timesteps tested. The relative errors obtained were so small 
that no discernable trend (with step size) can be shown, due to the sampling error, whereas the 
other schemes tested show an error consistent with second-order schemes (an example is given in 
Appendix C). 

Self-diffusion coefficients are calculated from integrating the computed velocity autocorrelation 
function, where a history is kept of the velocities for Ips. The values plotted in Figure 6 show 
that changing the step size within the indicated range using any of the schemes has only a very 
slight effect on the diffusion coefficient, while increasing the friction can dramatically reduce it. 
Examining the graphs, we settle on 7 = 1/ps as the largest value of 7 for which the diffusion 
coefficient is unperturbed for all the schemes. It is interesting that larger damping parameters do 
not substantially improve numerical stability for any of the methods, except in an extreme case 



18 



Relative error in 

average 

bond energy 



Relative error in 

average 

total potential 

energy 



Relative error in 

configurational 

temperature 




^ ^ 



100% 



- 10% 



\ 



1 1.5 2.25 3 1 1.5 2.25 3 

Step size (fs) Step size (fs) 



1 1.5 2.25 3 
Step size (fs) 



FIG. 5: Results from 2.5ns simulations of alanine dipeptide in vacuum at the given step size (horizontal) 
and friction (vertical). Pixels are colored according to relative errors for each simulation, with white pixels 
indicating instability. 
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for the VGB method (7 ^ 100/ps, where the diffusion constant is drasticahy reduced). 

Numerical values for the computed average potential energy are given in Table II for varying 
step size at 7 = 1/ps. 

VGB 



BAOAB 



ABOBA 



SPV 



LI 



BP 



BBK 




1 1.5 2.25 3 
Step size (fs) 

FIG. 6: The number of barrier recrossings (top) and the diffusion coefficients (bottom) are shown for each 
simulation, the latter in m^/s and computed by integrating the velocity autocorrelation function over an 
interval of Ips. As expected, the computed coefficients do not vary significantly between methods, though 
changing the friction above 1/ps has a substantial effect. 

If we accept, say, a 5% error tolerance for the average potential energy, we see that BAOAB 
admits a usable step size of up to 3fs, whereas the ABOBA and SPV schemes are restricted to a 
neighborhood of 2fs, with the usable timestep threshold for other methods well below 1.5fs. 

2. Solvated 



We immerse the alanine dipeptide molecule in a sphere of TIP3P water (10 A radius, total system 
is 424 atoms) and equilibrate for Ins at 300K to generate an initial configuration. We then run 
simulations using each scheme considered in the unsolvated case, using a lOA cutoff for electrostatics 
and van der Waals potentials. The value of friction was fixed at 1/ps, with runs performed with 
increasing step size. Initial timesteps were St = 2fs, with subsequent simulations increasing the step 
size by 5%, until reaching a step size where all of the methods fail. Each simulation was performed 
for 5ns at T = 300K using spherical (harmonically restrained) boundary conditions. Although 
the particular boundary conditions may not be representative of all biomolecular simulations, 
we contend that the crucial features of numerical stability and relative method performance are 
unaffected by the particular choice. 
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Scheme 


Average total potential energy 
(kcal/mol) 


Total number of observed 
recrossings 


5t = 1.5fs 


5t = 2fs 


5t = 2.5fs 


5t = 3fs 


5t = 1.5fs 


5t = 2fs 


5t = 2.5fs 


St = 3fs 


BAOAB 


1.65 ±0.04 


1.67 ±0.05 


1.68 ±0.03 


1.70 ±0.05 


821 ±17 


814 ±25 


823 ± 21 


798 ±11 


ABOBA 


1.69 ±0.05 


1.70 ±0.03 


1.75 ±0.03 


1.87 ±0.04 


823 ± 17 


829 ± 20 


826 ± 23 


833 ± 26 


SPV 


1.70 ±0.13 


1.70 ±0.05 


1.74 ±0.06 


1.88 ±0.03 


811 ±56 


822 ± 22 


838 ± 26 


835 ± 22 


VGB 


1.89 ±0.03 


2.14 ±0.05 


2.65 ±0.06 


4.27 ±0.04 


802 ± 19 


804 ± 28 


812 ±16 


813 ±17 


LI 


2.05 ±0.05 


2.42 ±0.04 


3.21 ±0.06 


5.84 ±0.06 


837 ± 33 


819 ±19 


822 ± 24 


821 ±27 


BP 


2.75 ±0.05 


3.91 ±0.06 


6.19 ±0.05 


14.0 ±0.18 


828 ± 27 


801 ± 19 


821 ±21 


818 ±41 


BBK 


2.78 ±0.03 


3.89 ±0.05 


6.21 ±0.06 


13.9 ±0.10 


826 ± 18 


825 ± 24 


834 ± 19 


835 ± 16 


Baseline 


1.66 ±0.04 


808 ± 28 



TABLE IL Numerical results for ten 2.5ns simulations of unsolvated alanine dipeptide, with friction set to 
7 = 1/ps. The mean and standard deviations of all simulations are given. The baseline comparison run was 
completed by averaging ten 2.5ns simulations using St = 0.25fs with the SPV scheme. Sampling error will 
play a large role in the determination of these averages, but it is clear that the BAOAB scheme outperforms 
the others by a significant margin. The number of observed recrossings is obtained by counting the number 
of times the central dihedral angles in the alanine dipeptide model hop between their two configurations. 



The results are given in Figure 7. Compared to the scheme of Bussi and Parrinello, the relative 
error in average total potential energy using the BAOAB scheme is seen to be smaller by two orders 
of magnitude when computed at a step size around 6t = 2.5fs. The surprising downward trend 
of the error using the BAOAB scheme could be indicative of higher-order terms dominating in 
the error expansion, showing that our asymptotic approach does not give definitive answers about 
bevavior in the large step size regime. The analytic results obtained for the discretization error are 
understood only for 6t ^ 0. More detailed analytical investigation of this phenomenon is beyond 
the scope of this article. 

The breakdown of results for all energy contributions is given in Appendix D. It is clear from 
these results that the average bond energy is a crucial component in explaining the results of Figure 
7. The average bond energy computed using the BAOAB scheme gives a flat profile with respect to 
the timestep increasing, whereas many other methods demonstrate an extreme drift approaching 
the stability threshold, causing a large error in the average total potential energy. The averages of 
other energies do not significantly contribute to the observed errors. 

The contribution of error coming from the restraining boundary condition energy was extremely 
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FIG. 7: Numerical results from 5ns runs of alanine dipeptide solvated in a 10 A sphere of TIP3P water are 
shown, using the given algorithms. Errors are computed against a baseline solution averaged from ten 5ns 
simulations using the SPV scheme at St = 0.5fs. Results from a single run are shown for each scheme except 
in the case of the BAOAB method. The BAOAB scheme shows an order of magnitude improvement in the 
error in computed average total potential energy; because of the small absolute errors, we exhibit the means 
and standard deviations from 10 runs for each step size used. The lack of an observed trend line for BAOAB 
suggests that the discretization error is being dominated by the sampling error. 

small, suggesting that the properties of the bulk water in the model are responsible for the dif- 
ferences in efficiency seen here. Hence we would expect the obseved corruption of averages to be 
generalizable to any simulations involving other boundary conditions, or other simulations involving 
water. 

VI. CONCLUSION 



We have studied a total of nine different integration methods for the Langevin dynamics equa- 
tions, including popular schemes that are in widespread use for molecular sampling. We have seen 
that some of these can be derived as splitting methods and in a few cases the perturbation of the 
invariant distribution has been determined in some regime (for example, the small step size, large 
friction limits). It is also possible to solve for the error in averages as a function of step size in 
the case of a harmonic oscillator, which we believe has direct relevance for biomolecular modelling 
where the bond stretches are modelled as harmonic restraints. Harmonic models are also likely to 
relate well to simulations of crystalline materials [34]. Our analyses show that a particular ordering 
of the building blocks of a splitting method, the BAOAB integrator [12], provides exact configu- 
rational averages for the harmonic oscillator and 4th order accurate configurational averages for 
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a general nonlinear model in the large friction limit. We have examined the performance of this 
method in relation to other schemes for toy models and for small biomolecular models both with 
and without solvent, with the observation that the analytical results on the error in distribution 
are highly correlated to their performance in practice. In particular, the BAOAB method performs 
very differently than the other methods in practical simulations, giving much higher accuracies 
(particularly for the configurational temperature) up to the Verlet stability threshold. 

A surprising observation is that discretization error, not sampling error, dominates in the simu- 
lations we performed, which involved a common small biomolecular test system and a time interval 
of only a few nanoseconds. 

Let us put the numerical results into perspective. Our simulations explore only a few of the 
available quantities that might be relevant for modelling. It is interesting that all of the second 
order schemes tested provided reasonable transition rates (in terms of barrier crossings), and so 
would give a similar rate of exploration of the phase space. Since molecular dynamics is often 
used for phase space exploration and supplemented by other techniques for precise averaging, the 
methods may have some utility regardless of the fact that they provide in some cases very poor 
approximation of configurational averages. It is also possible for a scheme to accurately resolve one 
quantity but not another (for example, in the case of unsolvated alanine dipeptide, the scheme of 
van Gunsteren and Berendsen gives reasonably good configurational temperatures but poor average 
potential energies). 

The ABOBA and SPV methods perform very similarly, and reasonably well, both for energy 
calculations and in terms of configurational temperature in all of the numerical experiments per- 
formed with alanine dipeptide. The similarity between these methods is a consequence of the fact 
that both are drift-kick-drift style algorithms, with a subtle difference in their "kick" updates: 
the ABOBA scheme solves (3) in a leapfrog manner by splitting off the force from the Ornstein- 
Uhlenbeck stochastic term, where as the SPV scheme solves equation (3) exactly for constant 
position q. One may expect that an exact solve would provide the method better properties, but 
in practice the leapfrog splitting in ABOBA is advantageous in the high friction regime. For large 
7, the result of solving exactly means that the momentum update becomes dominated by the 
noise, shrinking the contribution from the force term. The advantage of splitting up (3) is that 
this isolates the force from the noise, integrating it separately, making ABOBA (and indeed other 
schemes using the same splitting strategy, such as BAOAB and Bussi and Parrinello) effective for 
any value of 7 > 0. 

Using quantities based on the momenta to estimate temperature or diffusion constants is called 
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into question; certainly the connection between the accuracy of the kinetic energy average and the 
accuracy of other more directly relevant quantities is weak. The kinetic temperature measure is 
an accurate approximation of the true temperature in the case of the VGB scheme, but this same 
method gives relatively poor potential energy averages. 

We conclude by emphasizing that for bond energy and total potential energy averages, in vacuum 
simulation, the BAOAB method performs better than the other methods and is substantially better 
at large step sizes, giving a larger useful range of step size by a factor of at least 25% with an order 
of magnitude smaller errors at large step size. The differences are magnified still further when 
configurational temperatures are compared. 
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Appendix A: Implementation details of Figure 2 

We consider a single particle confined to the plane, with instantaneous horizontal and vertical 
position denoted x, ^ G M respectively. The particle feels a force with respect to the potential 
energy function 

1 + 5f 1 - exp f - 250 (y - 0.25)^ )) exp ( - lOOx^) 4 exp f - 20^^ 

U{x^y) = ^ ^ . A^ ^ ^ + 

^x^ + y'^ — 1 



qualitatively giving an energy surface with a narrow transition pathway between two basins. We 
seek to sample the canonical distribution pp(x,y) using Langevin dynamics, comparing the com- 
puted distributions given by the BAOAB and Bussi/Parrinello algorithms (given in the following 
section) at varying stepsizes. 

Figure 2 corresponds to numerical experiments using fixed friction constant 7 = 10 and tem- 
perature /3 = 1. From left to right, the stepsizes tested were 6t = [0.005,0.021,0.024,0.027,0.03]. 
Each image is a two dimensional histogram over the unit square centered on the origin, with 100 
equally spaced bins in both directions. The computed density shown for each timestep is averaged 
from 64 runs of 10^ steps. 

Appendix B: Numerical methods 

We present the numerical methods used in this article, assuming timestep 6t, friction constant 
7 and temperature T with diagonal mass matrix M and position and momentum vectors q^p re- 
spectively. The force is F{q) := —^U{q) and kB is Boltzmann's constant. Rn is a 3A^-vector 
of independent, identically distributed normal random numbers with zero mean and unit vari- 
ance. Where J > 1 random numbers are required per degree of freedom, multiple independent 
(uncorrelated) random vectors are denoted Rn \j = 1, . . . , J in the schemes. 
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BAOAB 

Note: This scheme is available in recent versions of NAMD (after Jan 2013) by including options 
Hangevin on'' and ^ langevinBAOAB on'' in the input parameter file. 



Pn+l/3 =Vn + -^F{qn), 
<ln+l/2 = qn+ -^M p„+i/3, 



ABOBA 



Pn+2/3 = e-^%n+i/i + ^/ ksT {I - e-'^'^^^)M^'^ R^, 

5n+l = 9n+l/2 + "^^ Pn+2/3) 
Pn+1 = Pn+2/3 + IT^l^n+l) 



9n+l/2 = 9n + ^M p„, 
Pn+1/3 = Pn + lT-f^(9n+l/2), 



Pn+2/3 = e-T"5*p„+i/3 + ^kBT{l - e-27<5*)Ml/2i?„, 
Pn+1 = Pn+2/3 + ^^(9n+l/2), 
g„+l = qn+l/2 + -jM pn+l 

Van Gunsteren/Berendsen (VGB) 
We must initialize the vector X, 

and then iterate 

1 — e~^^^ / \ 

Pn+i = e-^V + F{qn) + M (k+i - e-^^*K+i) 

g75t/2 _ g-7'5t/2 
Qn+1 = M~ Pn+1 + Xn+1 — Xn+1, 

7 
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where we use 3iV-vectors X, X, V, V. Constants Ki are given as 

Ki = -^fcBr(l-e-T"5*), 

K2 = r-^ T-Tz 



Z7 — 76' ' — 76 ' ' 

7^* - 3 + e-T"5* (4eT"5*/2 _ 1) ' 



7,5t (e^-S* - 1) - 4 (e^'5*/2 - l) ^ 

K3 = V^bT^ 75t - 3 + 6-75* (4e7'5*/2 - 1) ' 

K4 = 7"^\/fc^y7^t^^3Te-^^^^4^^^*/2^^, 

-1 ^^-^^ / 7(5t (e-7^^ - 1) + 4 (e-7^*/2 - if 

K6 = 7 vfe^ry g-^ji _ 1 • 

Stochastic Position Verlet (SPV) 



9n+l/2 = 9n + yM Pn, 

7 

9n+l = qn+1/2 + ^^ Pn+1 



?n+l/2) + ^lk^r{i^^^)M^I^Rn. 



Bussi/Parrinello (BP) 



Pn+2/4 = Pn+l/4 + -^-f' (,9nj, 

g„+l = Qn + <^tM-^p„+2/4, 
Pn+3/4 — Pn+2/4 + l^F{qn+l), 

Pn+1 = e-'^'^*/V+3/4 + ^fcBr(l-e-75*)MV242) 

Langevin Impulse (LI) 

We use the algorithm designed for configurational samphng; a correction term is given in [9] to 
improve the samphng of momenta, though this has no effect on configurational averages. We must 
initialize the 3A^-vector Z, 



Zi = M^/2 (aoRo + aRi] 
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and then iterate 



where 



Pn+l/i =Pn+'^StF{qn), 

Pn+2/4 = e"'^'^*/^ {Pn+1/4 + ^^n) , 

Qn+l ^qn+ ^^_^st/2 ^ Pn+2/i, 
Zn+i = M^/2 (^^^ ^ aRn+l) , 
Pn+3/4 = e-^''*/2p„+2/4 + <^Zn+l, 

Pn+1 = Pn+3/4 + '^F(g„+i), 



W= — ^r—, JTT, UJ = 1—UJ, 

75t(l-e-T"5*) ' 
a = ksT (2aj jSt + uj — Co) , 
b = UbT {2uL0j5t + u -co), 
c = ksT (2a)S(5t + co - co) , 



a = 2-i/2^c + a + V(c + a)2-462, 

ao = v a^ — c. 
Briinger/Brooks/Karplus (BBK) 

Pn+1 = (l + ^) {pn+1/2 + |i^(^n+l) + ^V^^^kBTStM^/^ Rn+l) , 

Ermak/McCammon (EM) 

As we consider only the scalar friction case, the update scheme for the position reduces to the 
Euler-Maruyama algorithm, with rescaled timestep.The update scheme for the momenta is unused 
in our numerical experiments, but can be found in [26]. We iterate 

Qn+i -qn + ^M-^F{qn) + V2D5tM-^'^Rn, 

kB-L 

where 

D = kBT/-f. 
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Ermak/Buckholtz (EB) 



1 — P~^^^ I 

1 — e "'•'''* 

Qn+l/S = 9n + M-^ {Pn+Pn+1 " 2-f-^F{qn)) /^ _^ ^-^St) ' 
9n+2/3 = ^n+l/S +7~"^'^^^~^^(9n), 



, ) 



Appendix C: Second order behavior of schemes 



We demonstrate the results for the relative error in configurational temperature, for simulations 
of alanine dipeptide in a vacuum at fixed friction 7 = 1.94/fs. This is equivalent to plotting a 
horizontal cross-section of the results given in Figure 5, at the corresponding friction value. We find 
that in all but one scheme a clear second order trend is visible; the exception is BAOAB which has 
much higher accuracy than the other methods and for which a trend line could not be resolved; we 
conjecture that for BAOAB the order of accuracy of the configurational temperature is substantially 
higher than two (consistent with our observations for one degree-of- freedom anharmonic cases). 
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Appendix D: Solvated results 



The breakdown of computed average energies are given for each method. The black dashed line 
marks the baseline solution for comparison. 
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